# High-resolution Image Processing

Unime LLaVA OneVision 7B

UniME is a general-purpose embedding learning framework built on multimodal large models. It improves multimodal embedding quality through two strategies: textual discriminative knowledge distillation and instruction tuning enhanced with hard negative samples.

Multimodal Alignment

Transformers English

Unime LLaVA 1.6 7B

UniME is a general-purpose embedding learning model built on a multimodal large model. It was trained at 336×336 image resolution and ranked first on the MMEB leaderboard.

Transformers English

PE Core B16 224

The Perception Encoder is a state-of-the-art image and video understanding encoder trained through simple vision-language learning, achieving top performance across various visual tasks.

PE Core L14 336

A large-scale visual encoder developed by Meta that achieves state-of-the-art performance on a range of vision tasks through contrastive pre-training and fine-tuning on synthetic video data.

Aimv2 3b Patch14 224.apple Pt

AIM-v2 is an efficient image encoder model compatible with the timm framework, suitable for computer vision tasks.

Image Classification

© 2025 AIbase